Written by Heidi Baumgartner and Rachel Martino at the Developmental Cognitive Neuroscience Lab at Brown University, directed by Dr. Dima Amso.
Some eye-tracking (ET) systems collect independent data streams from each eye (left and right), but analysis protocols often require the user to choose one eye or the other for analysis purposes. In optimal circumstances, it makes little difference which eye is used because these data streams are nearly identical. Sometimes, however, the data stream from one eye is significantly more stable or accurate than the stream from the other eye (e.g., this situation occurs often when tracking very young participants with a between-eye distance that is at the lower limit of what the eye tracker can handle).
This toolbox is designed to quantify the TRACKING RATIO (percent of ET samples with non-zero gaze coordinates) and calibration ACCURACY (deviation of estimated point-of-regard from defined coordinates) by eye (R/L) for eye tracking data and to RECOMMEND which eye to use for subsequent analyses based on these metrics. The toolbox also generates a PRECISION metric (modeled after Wass et al., 2014), which indicates the amount of sample-to-sample jitter or noise that is present in the data for each eye. This metric is currently used for visualization purposes only, and is not included in the recommendation decision.
The toolbox is designed to work with data generated by BeGaze (SensoMotoric Instruments), but it should work with other systems as long as the variables and format specified in the 'Using a non-SMI system' section are present.
Refer to the ET Quality Toolbox USER GUIDE for instructions on using the toolbox to generate quality metrics and eye recommendations.
These are the experiment-specific parameters that must be defined by the user.
Once these parameters have been set, you should not have to edit code in other sections beyond tinkering with plot sizing.
## path to data directory
datadir <- "/Users/heidibaumgartner/Documents/GitHub/EyeTrackingQuality_Toolbox/ETQuality_toolbox/ExampleExpt_Data"
## file name for summary metrics
outputname <- "ExampleExp_ETQuality_output.csv"
## sampling rate of eye tracker (Hz)
EyeTrackHz <- 60
## threshold for meaningful tracking ratio difference (%)
Threshold_TrackingDiff <- 5 # %
## threshold for meaningful deviation difference (pixels)
Threshold_AccuracyDiff <- 25 # pixels
## threshold for meaningful precision difference
Threshold_PrecisionDiff <- 1
## define names of stimuli used to measure validation accuracy (case-sensitive)
ValidationStim <- c('validation1.jpg',
'validation2.jpg',
'validation3.jpg',
'validation4.jpg')
## AOIs for corresponding items in 'ValidationStim' (case-sensitive)
ValidationAOI <- c('validation1',
'validation2',
'validation3',
'validation4')
## X coordinates for center of corresponding items in 'ValidationStim'
ValidationX <- c(480,
1440,
480,
1440)
## Y coordinates for center of corresponding items in 'ValidationStim'
ValidationY <- c(270,
270,
810,
810)
## define stimulus names to be INCLUDED in tracking ratio/distance calculations
ExperimentalStim <- c('Pic1.jpg', 'Pic2.jpg', 'Pic3.jpg', 'Movie1.avi', 'Movie2.avi')
## define filler stimulus names to be EXCLUDED from tracking ratio/distance calculations
# FillerStim <- c('Validation.jpg', 'validation1.jpg', 'validation2.jpg', 'validation3.jpg', 'validation4.jpg', 'Fixation.png')
## to include ALL stimuli in tracking ratio/distance calculations, use this instead
# FillerStim <- c('nothing')
## save plots as individual files?
saveplotstofile <- 1 # 0=no, 1=yes
## Aesthetics for plots (can change as desired)
marker_colors <- c('#007c92', '#e98300', '#8c1515','#8c1515')
marker_fills <- c('#007c92', NA, NA, NA) # NA for no fill
marker_shapes <- c(21,21,4,4) # 21=circle, 22=square, 23=diamond, 24=triangle, 25=invertedtriangle, 4=X
marker_size <- 2 # size of markers on point plots
marker_stroke <- 2 # width of marker stroke on point plots
legend_size <- 3 # size of markers in plot legends
title_color <- '#8c1515'
title_size <- 16 # font size of plot titles
yaxis_size <- 12 # font size of y-axis titles
line_colors <- c('#68a2c1','#0a8002','#808080')
## grayscale color options
# marker_colors <- c('gray', 'black', 'black', 'black')
# marker_fills <- c('gray', NA, NA, NA)
# title_color <- 'black'
## to include Precision chunks, prec_on = TRUE
## to exclude Precision chunks, prec_on = FALSE
prec_on <- TRUE
[You shouldn’t need to change anything here unless you are using data generated outside of BeGaze and need to adjust for different variable names in data file(s).]
Tracking ratio is calculated by dividing the number of samples with non-zero gaze data (TrackedSamples) by the total number of samples for the subset of stimuli defined above (either the stimuli in the ExperimentalStim list or all stimuli other than those in the FillerStim list), then multiplying by 100 to convert to a percentage.
\[ TR = \frac{TrackedSamples}{TotalSamples} \times 100 \]
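As a minimal sketch of this calculation, the tracking ratio for each eye can be computed from a raw sample table like the one below. The column names (Stimulus, L_POR_X, etc.) are illustrative assumptions, not the toolbox's actual BeGaze variable names:

```r
## Sketch: tracking ratio per eye. Column names are illustrative assumptions.
tracking_ratio <- function(dat, stimuli) {
  d <- dat[dat$Stimulus %in% stimuli, ]                # keep experimental stimuli only
  tracked_L <- sum(d$L_POR_X != 0 | d$L_POR_Y != 0)    # non-zero gaze samples, left eye
  tracked_R <- sum(d$R_POR_X != 0 | d$R_POR_Y != 0)    # non-zero gaze samples, right eye
  c(TrackingRatio_L = 100 * tracked_L / nrow(d),
    TrackingRatio_R = 100 * tracked_R / nrow(d))
}
```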
## Subject BetterTracking TrackingDiff TrackingRatio_L TrackingRatio_R
## 1: P01 NA 0 100 100
## 2: P02 L 45 99 54
## 3: P03 R 25 63 88
## 4: P04 L 2 52 50
## 5: P05 NA 0 90 90
## TotalSamples NSamples_L NSamples_R
## 1: 2322 2322 2313
## 2: 2500 2480 1338
## 3: 2653 1659 2341
## 4: 5604 2912 2828
## 5: 3156 2846 2842
This plot provides a quick visualization of tracking ratio by eye for each subject.
This plot provides a visualization of tracking ratio by stimulus. This allows you to quickly see if tracking ratio differs or is consistent across stimuli. Each circle represents a single presentation of the stimulus. Data points are jittered so that if stimuli are presented more than once, you will be able to see the tracking ratio during each presentation.
NOTE: If stimuli do not repeat (1 trial per stimulus), this figure will be redundant with Figure 1c.
This plot provides a visualization of tracking ratio by stimulus. If stimuli are presented more than once, the data point represents the tracking ratio (by eye) averaged across all presentations of that stimulus.
NOTE: If stimuli do not repeat (1 trial per stimulus), this figure will be redundant with Figure 1b.
This plot provides a visualization of tracking ratio by trial. This allows you to quickly see if a subject’s tracking ratio is consistent or changes over time.
Deviation statistics are calculated using the longest fixation to the validation stimulus on each validation trial (based on the assumption that the longest fixation that falls within the validation AOI on each trial most likely reflects looking to the validation stimulus). For each trial, the Euclidean distance (deviation) between the center of the validation stimulus and each point-of-regard (POR) sample within the longest fixation is calculated and then averaged. These per-trial deviation values are then averaged over all validation trials to generate an average deviation value.
\[\begin{aligned} Deviation_{sample} &= \sqrt{(STIM_{x}-POR_{x})^2+(STIM_{y}-POR_{y})^2} \\ \\ Deviation_{fixation} &= mean(Deviation_{samples}) \\ \\ Deviation_{average} &= mean(Deviation_{LongestFixations}) \end{aligned}\]
## Subject BetterAccuracy AccuracyDiff Deviation_L stdev_L Duration_L
## 1: P01 L 20.944346 25.02673 12.873421 1716.050
## 2: P02 L 6.874652 43.72860 5.828800 1620.275
## 3: P03 L 7.902798 71.87286 12.531135 1591.075
## 4: P04 L 12.543479 23.42790 5.403864 2190.667
## 5: P05 L 11.713289 106.00528 7.858005 2168.400
## Deviation_R stdev_R Duration_R
## 1: 45.97108 8.626094 1607.75
## 2: 50.60325 10.235618 1536.95
## 3: 79.77566 9.758520 1666.05
## 4: 35.97138 8.848672 2752.30
## 5: 117.71856 10.834453 1934.85
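The per-fixation deviation step above can be sketched as follows. Here por_x/por_y are the POR coordinates of the samples within the longest fixation on a trial, and stim_x/stim_y is the center of that trial's validation stimulus (from ValidationX/ValidationY); the function name is hypothetical:

```r
## Sketch: mean Euclidean distance between the validation stimulus center
## and each POR sample within the longest fixation on one trial.
fixation_deviation <- function(por_x, por_y, stim_x, stim_y) {
  mean(sqrt((stim_x - por_x)^2 + (stim_y - por_y)^2))
}
## The average deviation is then the mean of these per-trial values
## across all validation trials.
```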
This plot shows average deviation values (i.e., average distance of estimated POR from center of validation stimulus) by eye for each subject. Smaller deviation values indicate a better calibration/more accurate tracking.
This plot shows deviation values for each validation stimulus. This allows you to see at a glance if high deviations are due to a generally poor calibration (consistently high values across stimuli) or outlier value(s). It also allows you to see if any participants are missing data for one or more validation stimuli (missing data will result in a displayed warning before the plot).
## Warning: Removed 1 rows containing missing values (geom_point).
## Warning: Removed 1 rows containing missing values (geom_point).
This plot shows deviation values for each validation trial in the order in which they were presented. If validation stimuli are presented more than once (e.g. at beginning, middle, and end of experiment) this allows you to see at a glance if deviations are consistent or change (drift) over time.
## Warning: Removed 1 rows containing missing values (geom_point).
## Warning: Removed 1 rows containing missing values (geom_point).
Participants’ average distance from the screen (using the eye position Z value) is calculated for each eye during samples in which non-zero gaze data were collected (excluding non-experimental stimuli). This is probably not useful for determining which eye to use for analyses, but you might want to use the distance measure from the chosen eye when calculating a group average.
Distance is reported in millimeters (mm).
## Subject ScreenDistance_R ScreenDistance_L
## 1: P01 644.6132 647.5126
## 2: P02 632.3205 627.6549
## 3: P03 605.0091 597.7266
## 4: P04 591.3879 583.8653
## 5: P05 542.5811 532.8221
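A minimal sketch of this calculation is below. As before, the column names (L_EPOS_Z, R_EPOS_Z, etc.) are illustrative assumptions rather than the toolbox's actual variable names:

```r
## Sketch: average screen distance (eye position Z, in mm) per eye,
## over non-zero gaze samples during experimental stimuli only.
## Column names are illustrative assumptions.
screen_distance <- function(dat, stimuli) {
  d <- dat[dat$Stimulus %in% stimuli, ]
  c(ScreenDistance_L = mean(d$L_EPOS_Z[d$L_POR_X != 0], na.rm = TRUE),
    ScreenDistance_R = mean(d$R_EPOS_Z[d$R_POR_X != 0], na.rm = TRUE))
}
```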
This plot shows each subject’s average distance from the screen across all experimental trials.
The toolbox now uses the decision parameters outlined in the Overview section to recommend which eye provides ‘better’ data for each participant. The decision tree can be adjusted based on the priorities of the experiment (e.g., if tracking accuracy is less important than data quantity).
A flag of UNDETERMINED: missing data indicates that there were no fixations within the specified AOIs for the stimuli listed in ValidationStim, so the toolbox cannot make a recommendation based on tracking accuracy information. A flag of UNDETERMINED: conflict indicates that one eye has a better tracking ratio and the other eye has better accuracy, so the user should examine the data and make a determination of which eye to use.
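The decision tree described above can be sketched for a single subject as follows. This is an illustration of the logic, not the toolbox's exact code; in particular, the fallback when neither difference exceeds its threshold is an assumption:

```r
## Sketch of the eye-recommendation decision tree (single subject).
## track_thresh/acc_thresh default to the parameters set earlier.
recommend_eye <- function(tr_L, tr_R, dev_L, dev_R,
                          track_thresh = Threshold_TrackingDiff,
                          acc_thresh   = Threshold_AccuracyDiff) {
  if (is.na(dev_L) || is.na(dev_R)) return("UNDETERMINED: missing data")
  better_track <- if (abs(tr_L - tr_R) < track_thresh) NA
                  else if (tr_L > tr_R) "Left" else "Right"
  better_acc   <- if (abs(dev_L - dev_R) < acc_thresh) NA
                  else if (dev_L < dev_R) "Left" else "Right"  # smaller deviation wins
  if (!is.na(better_track) && !is.na(better_acc) && better_track != better_acc)
    return("UNDETERMINED: conflict")
  if (!is.na(better_track)) return(better_track)
  if (!is.na(better_acc))   return(better_acc)
  if (dev_L <= dev_R) "Left" else "Right"  # assumed fallback: more accurate eye
}
```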
This table is written to a .csv file and saved in the data directory.
Variable definitions:
## Subject EyeRec Match_TrackAccuracy BetterTracking
## 1: P01 Left 1 L
## 2: P02 Left 1 L
## 3: P03 Right 0 R
## 4: P04 Left 1 L
## 5: P05 Left 1 L
## TrackingDiffAboveThresh BetterAccuracy AccuracyDiffAboveThresh
## 1: 0 L 0
## 2: 1 L 0
## 3: 1 L 0
## 4: 0 L 0
## 5: 0 L 0
## TrackingDiff TrackingRatio_L TrackingRatio_R TotalSamples NSamples_L
## 1: 0 100 100 2322 2322
## 2: 45 99 54 2500 2480
## 3: 25 63 88 2653 1659
## 4: 2 52 50 5604 2912
## 5: 0 90 90 3156 2846
## NSamples_R AccuracyDiff Deviation_L stdev_L Duration_L Deviation_R
## 1: 2313 20.944346 25.02673 12.873421 1716.050 45.97108
## 2: 1338 6.874652 43.72860 5.828800 1620.275 50.60325
## 3: 2341 7.902798 71.87286 12.531135 1591.075 79.77566
## 4: 2828 12.543479 23.42790 5.403864 2190.667 35.97138
## 5: 2842 11.713289 106.00528 7.858005 2168.400 117.71856
## stdev_R Duration_R ScreenDistance_R ScreenDistance_L
## 1: 8.626094 1607.75 644.6132 647.5126
## 2: 10.235618 1536.95 632.3205 627.6549
## 3: 9.758520 1666.05 605.0091 597.7266
## 4: 8.848672 2752.30 591.3879 583.8653
## 5: 10.834453 1934.85 542.5811 532.8221
These plots visualize the distribution of between-eye differences for the quantity and accuracy metrics. They can help the user select threshold values for what constitutes a meaningful difference (i.e., one that should be taken into account when choosing which eye’s data to use for analyses).
Current threshold for a meaningful tracking ratio difference is 5%. Currently, 2 of 5 participants have a tracking ratio difference above this threshold.
Current threshold for a meaningful deviation value difference is 25 pixels. Currently, 0 of 5 participants have a deviation value difference above this threshold.
This metric was inspired by the toolbox created by Sam Wass and colleagues (Wass et al., 2014).
Currently, precision is included in the toolbox for visualization purposes only. This information is not integrated into the eye recommendation made above.
NOTE: The more data you have, the longer this chunk takes to run. To exclude this section, set the prec_on parameter to FALSE.
Precision is meant to be an indicator of how stable/jittery the estimated POR is over time. Perfectly precise tracking would be characterized by sustained periods of very small sample-to-sample changes in X/Y coordinates (fixations) separated by short periods of large changes (saccades) or missing data (blinks or looks away from the screen). Poor precision is characterized by larger sample-to-sample changes in X/Y coordinates, even within fixations. In general, better precision is correlated with more accurate estimates of POR. The precision metric is an index of moment-to-moment stability in the location of the estimated POR (‘jitter’), and does not take stability of the track itself (‘flicker’) or missing data into consideration.
To calculate precision, the looking data are broken into short time windows (window size is defined in code; the default is 100 ms). The median X/Y gaze coordinates within each window are calculated for each eye, and a difference score is computed for the X/Y coordinates at each time point relative to that window’s median value. Windows in which half or more of the samples are missing are dropped from the analysis. Precision scores for each variable (RightX, RightY, LeftX, LeftY) are calculated by finding the median difference score for each coordinate across all experimental stimuli, and overall precision scores are then calculated for each eye by averaging the X and Y precision scores. Lower precision scores are better (i.e., more stable gaze).
\[\begin{aligned} MedianX_{window} &= median(PORX_{sample1},PORX_{sample2},...,PORX_{sampleN}) \\ MedianY_{window} &= median(PORY_{sample1},PORY_{sample2},...,PORY_{sampleN}) \\ \\ DifferenceX_{sample} &= \left\lvert PORX_{sample} - MedianX_{window} \right\rvert \\ DifferenceY_{sample} &= \left\lvert PORY_{sample} - MedianY_{window} \right\rvert \\ \\ JitterX &= median(DifferenceX_{samples}) \\ JitterY &= median(DifferenceY_{samples}) \\ \\ Precision &= \frac{JitterX + JitterY}{2} \end{aligned}\]
## Subject BetterPrecision RPOR_precision LPOR_precision RPORX_diff_median
## 1: P01 L 1.6000 1.450 1.35
## 2: P02 L 1.8375 1.725 1.60
## 3: P03 L 1.5250 1.500 1.35
## 4: P04 L 0.6500 0.600 0.60
## 5: P05 L 1.0250 0.700 1.10
## RPORY_diff_median LPORX_diff_median LPORY_diff_median PrecisionDiff
## 1: 1.850 1.3 1.60 0.1500
## 2: 2.075 1.6 1.85 0.1125
## 3: 1.700 1.3 1.70 0.0250
## 4: 0.700 0.6 0.60 0.0500
## 5: 0.950 0.7 0.70 0.3250
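The windowed-median calculation described above can be sketched for one eye as follows. This is a simplified illustration (it does not drop half-missing windows, and the function name is hypothetical); x and y are the raw POR coordinates for one eye:

```r
## Sketch: precision (jitter) for one eye. Samples are grouped into
## fixed windows (default 100 ms), absolute deviations from each
## window's median are computed, and the per-coordinate medians of
## those deviations are averaged.
precision_one_eye <- function(x, y, hz = EyeTrackHz, win_ms = 100) {
  n_win  <- max(1, round(win_ms / (1000 / hz)))      # samples per window
  window <- (seq_along(x) - 1) %/% n_win             # window index per sample
  diff_x <- abs(x - ave(x, window, FUN = median))    # |PORX - window median|
  diff_y <- abs(y - ave(y, window, FUN = median))    # |PORY - window median|
  (median(diff_x, na.rm = TRUE) + median(diff_y, na.rm = TRUE)) / 2
}
```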
Precision values for each eye at each time point are calculated by averaging the X and Y difference values (differences between POR and the smoothed window median). Values close to zero indicate stable tracking (little to no difference between smoothed and unsmoothed values). Occasional spikes indicate large sample-to-sample changes (e.g., saccades) and will not affect overall precision values (which are based on medians, not averages).
Because experiments often have thousands of samples, it is generally not useful to plot the entire experiment at once. By default, this figure plots a 500-sample window starting at the sample number defined by xwindow_min. To adjust the starting point of the plotted window, edit xwindow_min; to adjust the length of the plotted window, edit xwindow_max.
Plot of overall median precision values, by eye. Precision values close to zero indicate good overall precision, while higher values indicate less precision (more noise/jitter). Median precision values are derived from the difference between raw POR and smoothed window medians at each sample (see example window figure).
This plot shows LEFT EYE raw gaze (POR) X and Y coordinates and smoothed window median coordinates for an example time window. Differences between POR and window medians are plotted near the x-axis.
This plot shows RIGHT EYE raw gaze (POR) X and Y coordinates and smoothed window median coordinates for an example time window. Differences between POR and window medians are plotted near the x-axis.
Contact Heidi Baumgartner (heidibaum@gmail.com) with questions or with feature requests.
Special thanks to Kristen Tummeltshammer, Andrew Lynn, and other members of the DCN Lab at Brown University for help with testing and feature suggestions.